28 research outputs found

    Phylotastic! Making tree-of-life knowledge accessible, reusable and convenient

    Full text link
    Abstract Background Scientists rarely reuse expert knowledge of phylogeny, in spite of years of effort to assemble a great “Tree of Life” (ToL). A notable exception involves the use of Phylomatic, which provides tools to generate custom phylogenies from a large, pre-computed, expert phylogeny of plant taxa. This suggests great potential for a more generalized system that, starting with a query consisting of a list of any known species, would rectify non-standard names, identify expert phylogenies containing the implicated taxa, prune away unneeded parts, and supply branch lengths and annotations, resulting in a custom phylogeny suited to the user’s needs. Such a system could become a sustainable community resource if implemented as a distributed system of loosely coupled parts that interact through clearly defined interfaces. Results With the aim of building such a “phylotastic” system, the NESCent Hackathons, Interoperability, Phylogenies (HIP) working group recruited 2 dozen scientist-programmers to a weeklong programming hackathon in June 2012. During the hackathon (and a three-month follow-up period), 5 teams produced designs, implementations, documentation, presentations, and tests including: (1) a generalized scheme for integrating components; (2) proof-of-concept pruners and controllers; (3) a meta-API for taxonomic name resolution services; (4) a system for storing, finding, and retrieving phylogenies using semantic web technologies for data exchange, storage, and querying; (5) an innovative new service, DateLife.org, which synthesizes pre-computed, time-calibrated phylogenies to assign ages to nodes; and (6) demonstration projects. These outcomes are accessible via a public code repository (GitHub.com), a website ( http://www.phylotastic.org ), and a server image. Conclusions Approximately 9 person-months of effort (centered on a software development hackathon) resulted in the design and implementation of proof-of-concept software for 4 core phylotastic components, 3 controllers, and 3 end-user demonstration tools. While these products have substantial limitations, they suggest considerable potential for a distributed system that makes phylogenetic knowledge readily accessible in computable form. Widespread use of phylotastic systems will create an electronic marketplace for sharing phylogenetic knowledge that will spur innovation in other areas of the ToL enterprise, such as annotation of sources and methods and third-party methods of quality assessment.http://deepblue.lib.umich.edu/bitstream/2027.42/112888/1/12859_2013_Article_5897.pd

    Data access for the 1,000 Plants (1KP) project

    Get PDF
    © 2014 Matasci et al.; licensee BioMed Central Ltd. The 1,000 plants (1KP) project is an international multi-disciplinary consortium that has generated transcriptome data from over 1,000 plant species, with exemplars for all of the major lineages across the Viridiplantae (green plants) clade. Here, we describe how to access the data used in a phylogenomics analysis of the first 85 species, and how to visualize our gene and species trees. Users can develop computational pipelines to analyse these data, in conjunction with data of their own that they can upload. Computationally estimated protein-protein interactions and biochemical pathways can be visualized at another site. Finally, we comment on our future plans and how they fit within this scalable system for the dissemination, visualization, and analysis of large multi-species data sets

    Phylotranscriptomic analysis of the origin and early diversification of land plants

    Get PDF
    Reconstructing the origin and evolution of land plants and their algal relatives is a fundamental problem in plant phylogenetics, and is essential for understanding how critical adaptations arose, including the embryo, vascular tissue, seeds, and flowers. Despite advances inmolecular systematics, some hypotheses of relationships remain weakly resolved. Inferring deep phylogenies with bouts of rapid diversification can be problematic; however, genome-scale data should significantly increase the number of informative characters for analyses. Recent phylogenomic reconstructions focused on the major divergences of plants have resulted in promising but inconsistent results. One limitation is sparse taxon sampling, likely resulting from the difficulty and cost of data generation. To address this limitation, transcriptome data for 92 streptophyte taxa were generated and analyzed along with 11 published plant genome sequences. Phylogenetic reconstructions were conducted using up to 852 nuclear genes and 1,701,170 aligned sites. Sixty-nine analyses were performed to test the robustness of phylogenetic inferences to permutations of the datamatrix or to phylogeneticmethod, including supermatrix, supertree, and coalescent-based approaches, maximumlikelihood and Bayesian methods, partitioned and unpartitioned analyses, and amino acid versus DNA alignments. Among other results, we find robust support for a sister-group relationship between land plants and one group of streptophyte green algae, the Zygnematophyceae. Strong and robust support for a clade comprising liverworts and mosses is inconsistent with a widely accepted view of early land plant evolution, and suggests that phylogenetic hypotheses used to understand the evolution of fundamental plant traits should be reevaluated

    The taxonomic name resolution service : an online tool for automated standardization of plant names

    Get PDF
    © The Author(s), 2013. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in BMC Bioinformatics 14 (2013): 16, doi:10.1186/1471-2105-14-16.The digitization of biodiversity data is leading to the widespread application of taxon names that are superfluous, ambiguous or incorrect, resulting in mismatched records and inflated species numbers. The ultimate consequences of misspelled names and bad taxonomy are erroneous scientific conclusions and faulty policy decisions. The lack of tools for correcting this ‘names problem’ has become a fundamental obstacle to integrating disparate data sources and advancing the progress of biodiversity science. The TNRS, or Taxonomic Name Resolution Service, is an online application for automated and user-supervised standardization of plant scientific names. The TNRS builds upon and extends existing open-source applications for name parsing and fuzzy matching. Names are standardized against multiple reference taxonomies, including the Missouri Botanical Garden's Tropicos database. Capable of processing thousands of names in a single operation, the TNRS parses and corrects misspelled names and authorities, standardizes variant spellings, and converts nomenclatural synonyms to accepted names. Family names can be included to increase match accuracy and resolve many types of homonyms. Partial matching of higher taxa combined with extraction of annotations, accession numbers and morphospecies allows the TNRS to standardize taxonomy across a broad range of active and legacy datasets. We show how the TNRS can resolve many forms of taxonomic semantic heterogeneity, correct spelling errors and eliminate spurious names. As a result, the TNRS can aid the integration of disparate biological datasets. Although the TNRS was developed to aid in standardizing plant names, its underlying algorithms and design can be extended to all organisms and nomenclatural codes. The TNRS is accessible via a web interface at http://tnrs.iplantcollaborative.org/ webcite and as a RESTful web service and application programming interface. Source code is available at https://github.com/iPlantCollaborativeOpenSource/TNRS/ webcite.BJE was supported by NSF grant DBI 0850373 and TR by CSIRO Marine and Atmospheric Research, Australia,. BB and BJE acknowledge early financial support from Conservation International and TEAM who funded the development of early prototypes of taxonomic name resolution. The iPlant Collaborative (http://www.iplantcollaborative.org) is funded by a grant from the National Science Foundation (#DBI-0735191)

    The iPlant Collaborative: Cyberinfrastructure for Plant Biology

    Get PDF
    The iPlant Collaborative (iPlant) is a United States National Science Foundation (NSF) funded project that aims to create an innovative, comprehensive, and foundational cyberinfrastructure in support of plant biology research (PSCIC, 2006). iPlant is developing cyberinfrastructure that uniquely enables scientists throughout the diverse fields that comprise plant biology to address Grand Challenges in new ways, to stimulate and facilitate cross-disciplinary research, to promote biology and computer science research interactions, and to train the next generation of scientists on the use of cyberinfrastructure in research and education. Meeting humanity's projected demands for agricultural and forest products and the expectation that natural ecosystems be managed sustainably will require synergies from the application of information technologies. The iPlant cyberinfrastructure design is based on an unprecedented period of research community input, and leverages developments in high-performance computing, data storage, and cyberinfrastructure for the physical sciences. iPlant is an open-source project with application programming interfaces that allow the community to extend the infrastructure to meet its needs. iPlant is sponsoring community-driven workshops addressing specific scientific questions via analysis tool integration and hypothesis testing. These workshops teach researchers how to add bioinformatics tools and/or datasets into the iPlant cyberinfrastructure enabling plant scientists to perform complex analyses on large datasets without the need to master the command-line or high-performance computational services

    Data access for the 1,000 Plants (1KP) project.

    No full text

    Immune biomarkers associated with COVID-19 disease severity in an urban, hospitalized population

    No full text
    Objectives: We sought to identify immune biomarkers associated with severe Coronavirus disease 2019 (COVID-19) in patients admitted to a large urban hospital during the early phase of the SARS-CoV-2 pandemic. Design: The study population consisted of SARS-CoV-2 positive subjects admitted for COVID-19 (n = 58) or controls (n = 14) at the Los Angeles County University of Southern California Medical Center between April 2020 through December 2020. Immunologic markers including chemokine/cytokines (IL-6, IL-8, IL-10, IP-10, MCP-1, TNF-α) and serologic markers against SARS-CoV-2 antigens (including spike subunits S1 and S2, receptor binding domain, and nucleocapsid) were assessed in serum collected on the day of admission using bead-based multiplex immunoassay panels. Results: We observed that body mass index (BMI) and SARS-CoV-2 antibodies were significantly elevated in patients with the highest COVID-19 disease severity. IP-10 was significantly elevated in COVID-19 patients and was associated with increased SARS-CoV-2 antibodies. Interactions among all available variables on COVID-19 disease severity were explored using a linear support vector machine model which supported the importance of BMI and SARS-CoV-2 antibodies. Conclusions: Our results confirm the known adverse association of BMI on COVID-19 severity and suggest that IP-10 and SARS-CoV-2 antibodies could be useful to identify patients most likely to experience the most severe forms of the disease
    corecore